The early 20th century brought forth a surge of interest from mathematicians, computer scientists, and neuroscientists in formalizing the notion of computability. These cross-disciplinary efforts yielded a slew of characterizations of computational systems, such as the Turing Machine and the Lambda Calculus. One of the lesser-known outcomes of this movement was a neuronal model proposed by Warren McCulloch and Walter Pitts. It served as a precursor to the neural networks and machine learning tools of today, and it exhibits surprising computational scope and power given its formal simplicity.
Simply put, a McCulloch-Pitts neuron consists of the following:

- a node with an integer threshold,
- a set of binary inputs feeding into the node, each marked as either excitatory or inhibitory,
- a binary output: the node fires (outputs 1) exactly when no inhibitory input is active and the sum of the active excitatory inputs meets or exceeds the threshold; otherwise it outputs 0.
Given this definition, four natural questions arise:

1. Can every logical function be computed by a network of these neurons?
2. Is the inhibitory mechanism actually necessary?
3. Does allowing relative inhibition change the class of computable functions?
4. Given an arbitrary logical function, how do we construct a network that computes it?
We will provide an answer to each of these in the following sections of this tutorial. But first, let's take a look at how the McCulloch-Pitts neuron can be implemented in code. Retina provides its own `MPNeuron` and `MPInput` classes in the machine learning module. They are reproduced here.
In [2]:
class MPNeuron(object):
    """A McCulloch-Pitts unit with a firing threshold and a list of inputs."""
    def __init__(self, threshold, inputs):
        self.threshold = threshold
        self.inputs = inputs

    def activate(self):
        # Any active inhibitory input silences the neuron outright; otherwise
        # the neuron fires when the excitatory inputs reach the threshold.
        excitations = 0
        for trigger in self.inputs:
            if trigger.excitatory:
                excitations += trigger.value
            else:
                if trigger.value:
                    return 0
        if excitations >= self.threshold:
            return 1
        return 0


class MPInput(object):
    """A single input line, either excitatory or inhibitory, holding a binary value."""
    def __init__(self, excitatory):
        self.excitatory = excitatory
        self.value = 0

    def trigger(self, value):
        self.value = value
Here, the `MPNeuron` class corresponds to the "node" concept described above. Each neuron is constructed with a threshold value and a list of `MPInput` objects. Each of these objects is constructed by specifying `excitatory=True` or `excitatory=False`; a value of `False` indicates that the input is inhibitory. `trigger`ing an input assigns it a value.

The `activate` function of the `MPNeuron` class checks whether any inhibitory input is active, sums the values of the excitatory inputs, and returns 1 if the neuron fires and 0 otherwise.
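As a minimal illustration (our own example, not part of the Retina module), here is how a neuron with one excitatory and one inhibitory input might be wired up and activated by hand:

# A neuron with threshold 1, one excitatory input, and one inhibitory input.
# It fires only when the excitatory input is on and the inhibitory input is off.
excite = MPInput(True)
inhibit = MPInput(False)
neuron = MPNeuron(1, [excite, inhibit])

excite.trigger(1)
inhibit.trigger(0)
print(neuron.activate())  # 1: threshold met, no inhibition

inhibit.trigger(1)
print(neuron.activate())  # 0: the active inhibitory input vetoes the neuron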
It turns out that every $n$-ary logical function can be computed by a McCulloch-Pitts network composed of a sufficient number of units. For a proof of this, we will appeal to a standard result from propositional logic: namely, that any adequate set of connectives is sufficient for expressing any given truth table. It is well known that the boolean operators AND and NOT form such an adequate set, so to show that any $n$-ary logical function can be expressed by some McCulloch-Pitts network, we need only show that we can devise neurons equivalent to the AND and NOT operators. Then, any logical function can be constructed by linking together some number of these units in a sensible way.
Pictured below are representations of three different McCulloch-Pitts neurons capable of computing the binary logical functions AND, OR, and NOT, respectively.
In each case $x_1$ and $x_2$ (just $x_1$ in the case of NOT) are the inputs to the units. The circle at the terminal end of an input indicates that it is inhibitory. All other inputs are excitatory.
We can verify that these units compute the functions they claim to by comparing their outputs with the truth table for each function. Let's start with AND. The truth table is given by:
$x_1$ | $x_2$ | AND($x_1$, $x_2$) |
---|---|---|
0 | 0 | 0 |
1 | 0 | 0 |
0 | 1 | 0 |
1 | 1 | 1 |
Now, to compute the output of the neuron:
$x_1 = 0$, $x_2 = 0$, and $x_1 + x_2 = 0 + 0 = 0 < 2$ so the neuron outputs 0.
$x_1 = 1$, $x_2 = 0$, and $x_1 + x_2 = 1 + 0 = 1 < 2$ so the neuron outputs 0.
$x_1 = 0$, $x_2 = 1$, and $x_1 + x_2 = 0 + 1 = 1 < 2$ so the neuron outputs 0.
$x_1 = 1$, $x_2 = 1$, and $x_1 + x_2 = 1 + 1 = 2$ so the neuron outputs 1.
Now, let's do the same for NOT. The truth table is given by:
$x_1$ | NOT($x_1$) |
---|---|
0 | 1 |
1 | 0 |
Now, to compute the output of the corresponding neuron.
If $x_1 = 0$ then the neuron is not inhibited and the total excitation is 0 which is equal to the threshold of 0. Thus the neuron outputs 1.
If $x_1 = 1$ then the neuron is inhibited and does not fire, outputting 0.
Finally, we'll verify the OR neuron against the truth table for OR in the same way.
$x_1$ | $x_2$ | OR($x_1$, $x_2$) |
---|---|---|
0 | 0 | 0 |
1 | 0 | 1 |
0 | 1 | 1 |
1 | 1 | 1 |
Now, to compute the output of the neuron:
$x_1 = 0$, $x_2 = 0$, and $x_1 + x_2 = 0 + 0 = 0 < 1$ so the neuron outputs 0.
$x_1 = 1$, $x_2 = 0$, and $x_1 + x_2 = 1 + 0 = 1 \geq 1$ so the neuron outputs 1.
$x_1 = 0$, $x_2 = 1$, and $x_1 + x_2 = 0 + 1 = 1 \geq 1$ so the neuron outputs 1.
$x_1 = 1$, $x_2 = 1$, and $x_1 + x_2 = 1 + 1 = 2 \geq 1$ so the neuron outputs 1.
Since $\{AND, NOT\}$ is adequate, this proves that any logical function can be computed by a network of McCulloch-Pitts neurons.
Now that we have abstractly specified the configuration of the AND, OR, and NOT neurons, let's see how to implement them using the `MPNeuron` and `MPInput` classes defined above.
In [3]:
def AND(x1, x2):
    # Two excitatory inputs and a threshold of 2: both inputs must be on.
    inputs = [MPInput(True), MPInput(True)]
    gate = MPNeuron(2, inputs)
    inputs[0].trigger(x1)
    inputs[1].trigger(x2)
    return gate.activate()


def OR(x1, x2):
    # Two excitatory inputs and a threshold of 1: at least one input must be on.
    inputs = [MPInput(True), MPInput(True)]
    gate = MPNeuron(1, inputs)
    inputs[0].trigger(x1)
    inputs[1].trigger(x2)
    return gate.activate()


def NOT(x):
    # A single inhibitory input and a threshold of 0: the neuron fires
    # unless the input is on.
    inputs = [MPInput(False)]
    gate = MPNeuron(0, inputs)
    inputs[0].trigger(x)
    return gate.activate()
Let's test these functions on a few representative inputs.
In [4]:
NOT(0)
Out[4]:
1
In [5]:
NOT(1)
Out[5]:
0
In [7]:
AND(0, 0)
Out[7]:
0
In [8]:
AND(0, 1)
Out[8]:
0
In [9]:
AND(1, 0)
Out[9]:
0
In [10]:
AND(1, 1)
Out[10]:
1
In [11]:
OR(0, 0)
Out[11]:
0
In [12]:
OR(0, 1)
Out[12]:
1
In [13]:
OR(1, 0)
Out[13]:
1
In [14]:
OR(1, 1)
Out[14]:
1
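Since $\{AND, NOT\}$ (and, conveniently, OR) are now available as building blocks, more complicated functions can be obtained simply by composing these gates. As a quick sketch (the XOR helper below is our addition, not part of the Retina module), exclusive-or can be written as AND(OR($x_1$, $x_2$), NOT(AND($x_1$, $x_2$))):

def XOR(x1, x2):
    # XOR is true when at least one input is on but not both.
    return AND(OR(x1, x2), NOT(AND(x1, x2)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, XOR(a, b))  # XOR column reads 0, 1, 1, 0 down the rows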
Inhibition is necessary for the functional completeness of the McCulloch-Pitts neuron: uninhibited threshold logic can only implement monotonic logical functions. NOT, for example, is non-monotonic (NOT(0) = 1 while NOT(1) = 0), so it cannot be realized without an inhibitory input.
Proof (taken from *Neural Networks* by Raul Rojas):
Assume that the input vector $(1, 1, \ldots, 1)$ is assigned the function value 0. Since no other vector can set more edges in the network to 1 than this vector does, any other input vector can also only be evaluated to 0. In general, if the ones in the input vector $y$ are a subset of the ones in the input vector $x$, then the first cannot set more edges to 1 than $x$ does. This implies that $f(x) \geq f(y)$, as had to be shown.
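As a small sanity check (our own addition, not from the original text), we can confirm this monotonicity empirically for a purely excitatory neuron: whenever the ones of $y$ form a subset of the ones of $x$, the neuron's output on $y$ never exceeds its output on $x$.

from itertools import product

def uninhibited_gate(threshold, values):
    # A neuron with only excitatory inputs, evaluated on the given input values.
    inputs = [MPInput(True) for _ in values]
    gate = MPNeuron(threshold, inputs)
    for inp, v in zip(inputs, values):
        inp.trigger(v)
    return gate.activate()

for x in product([0, 1], repeat=3):
    for y in product([0, 1], repeat=3):
        if all(yi <= xi for xi, yi in zip(x, y)):  # ones of y are a subset of ones of x
            assert uninhibited_gate(2, x) >= uninhibited_gate(2, y)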
Allowing for relative inhibition in McCulloch-Pitts neurons does not expand the class of functions computable by their networks.
Proof: To be added soon.
Since any logical function can be computed by a McCulloch-Pitts network, we now seek to make this result constructive. Given an arbitrary logical function, how can we construct a McCulloch-Pitts network to compute it?
It should be evident that any $n$-ary logical function is completely determined by specifying those inputs it maps to 1, i.e. the preimage of 1. For a given function $f:\{0, 1\}^n \to \{0, 1\}$, consider an argument $(x_1, \ldots, x_n)$ in the preimage of 1. We can construct a McCulloch-Pitts neuron which fires when presented with this and only this input as follows. Define a neuron whose threshold is $\sum_{i=1}^{n} x_i$ and whose $i$-th input is excitatory if $x_i = 1$ and inhibitory if $x_i = 0$. If any position where $x_i = 0$ receives a 1, the corresponding inhibitory input silences the neuron; otherwise the threshold is reached exactly when every position where $x_i = 1$ receives a 1, so the neuron fires only on the input $(x_1, \ldots, x_n)$.
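For instance, here is a brief sketch (our own example, not from the original text) of the neuron matched to the vector $(1, 0, 1)$: threshold 2, excitatory inputs in the first and third positions, and an inhibitory input in the second.

# Neuron that fires only on the input (1, 0, 1).
inputs = [MPInput(True), MPInput(False), MPInput(True)]
matcher = MPNeuron(2, inputs)

for candidate in [(1, 0, 1), (1, 1, 1), (0, 0, 1), (1, 0, 0)]:
    for inp, v in zip(inputs, candidate):
        inp.trigger(v)
    print(candidate, matcher.activate())  # only (1, 0, 1) yields 1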
Now, say we want to construct a network to compute $f$. To accomplish this, we generate the McCulloch-Pitts neuron defined above for each $(x_1, \ldots, x_n)$ in the preimage of 1. Say there are $m$ vectors in the preimage of 1. We then feed the output of these $m$ units to an $m$-ary OR neuron, which has $m$ excitatory inputs and a threshold of 1, so that it fires whenever at least one of the $m$ units fires.
The resulting network computes $f$, so this concludes the proof. Now, let's see how we can define such a network constructor in code. In Retina, we have implemented such a class and called it a `Decoder`.
In [15]:
class Decoder(object):
    def __init__(self, vectors):
        # vectors: the list of input vectors that the decoded function maps to 1.
        self.vectors = vectors
        self.vec_length = len(self.vectors[0])
        assert all(len(vec) == self.vec_length for vec in self.vectors)

    def decode(self):
        # Build one neuron per vector in the preimage of 1.  Each neuron fires
        # if and only if the input matches its vector exactly: positions equal
        # to 1 become excitatory inputs, positions equal to 0 become inhibitory
        # inputs, and the threshold is the number of ones in the vector.
        decoder_units = []
        for vector in self.vectors:
            threshold = sum(vector)
            inputs = []
            for i in range(self.vec_length):
                if vector[i] == 1:
                    inputs.append(MPInput(True))
                else:
                    inputs.append(MPInput(False))
            gate = MPNeuron(threshold, inputs)
            decoder_units.append(gate)

        def decoder(*args):
            # Present the argument to every unit and OR the activations together.
            or_arg = 0
            for unit in decoder_units:
                for i in range(self.vec_length):
                    unit.inputs[i].trigger(args[i])
                or_arg = OR(or_arg, unit.activate())
            return or_arg

        return decoder
The `Decoder` class is instantiated with a list of vectors in the preimage of 1. When the `decode` function is called, the logical function defined by this preimage is returned. Let's see how this works. Suppose we want to generate a logical function which sends $(0, 1, 0)$, $(0, 1, 1)$, and $(1, 0, 1)$ to 1 and all other 3-ary inputs to 0. We can accomplish this with the following code.
In [16]:
decoder = Decoder([[0, 1, 0], [0, 1, 1], [1, 0, 1]])
f = decoder.decode()
In [17]:
f(0, 1, 0)
Out[17]:
1
In [18]:
f(0, 1, 1)
Out[18]:
1
In [19]:
f(1, 0, 1)
Out[19]:
1
In [20]:
f(0, 0, 0)
Out[20]:
0
In [21]:
f(0, 0, 1)
Out[21]:
0
In [23]:
f(1, 0, 0)
Out[23]:
0
In [24]:
f(1, 1, 0)
Out[24]:
0
In [25]:
f(1, 1, 1)
Out[25]:
0
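As one final sketch (again our own addition, not from the Retina module), the same `Decoder` can reproduce a familiar function such as XOR directly from its preimage of 1:

# XOR maps exactly (0, 1) and (1, 0) to 1.
xor_decoder = Decoder([[0, 1], [1, 0]])
xor = xor_decoder.decode()
print(xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1))  # 0 1 1 0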
And there you have it: a concrete demonstration that any logical function can be computed using a network of McCulloch-Pitts neurons!